lifelong reinforcement learning
A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
In this paper we consider the problem of how a reinforcement learning agent that is tasked with solving a sequence of reinforcement learning problems (a sequence of Markov decision processes) can use knowledge acquired early in its lifetime to improve its ability to solve new problems. We argue that previous experience with similar problems can provide an agent with information about how it should explore when facing a new but related problem. We show that the search for an optimal exploration strategy can be formulated as a reinforcement learning problem itself, and demonstrate that such a strategy can leverage patterns found in the structure of related problems. We conclude with experiments that show the benefits of optimizing an exploration strategy using our proposed framework.
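The abstract's central idea, treating the search for an exploration strategy as an RL problem in its own right, can be illustrated with a toy sketch. Everything below is my own construction on a small chain world, not the paper's setup: a tabular "advisor" chooses the agent's exploratory actions and is reused across a sequence of related tasks. For simplicity the advisor here is rewarded myopically with the agent's immediate reward, whereas the paper's meta-MDP rewards the advisor with the agent's return.

```python
import random
from collections import defaultdict

N = 6                 # chain of states 0..N-1; the goal is usually on the right
ACTIONS = (-1, 1)     # move left / move right

def greedy(qtab, s):
    """Greedy action with random tie-breaking."""
    best = max(qtab[(s, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if qtab[(s, a)] == best])

def run_task(goal, advisor_q, alpha=0.1, eps=0.3, episodes=30):
    q = defaultdict(float)    # task-specific action values (discarded per task)
    total = 0.0
    for _ in range(episodes):
        s = 0
        for _ in range(2 * N):
            explore = random.random() < eps
            # exploratory steps are delegated to the advisor's policy
            a = greedy(advisor_q if explore else q, s)
            s2 = min(max(s + a, 0), N - 1)
            r = 1.0 if s2 == goal else 0.0
            q[(s, a)] += alpha * (r + max(q[(s2, b)] for b in ACTIONS) - q[(s, a)])
            if explore:
                # the advisor's own (simplified, myopic) meta-reward
                advisor_q[(s, a)] += alpha * (r - advisor_q[(s, a)])
            total += r
            s = s2
            if r > 0:
                break
    return total

random.seed(0)
advisor_q = defaultdict(float)            # persists across the task sequence
returns = [run_task(N - 1, advisor_q) for _ in range(20)]
```

After a few tasks the advisor has learned that stepping right near the goal pays off, so exploratory actions in later tasks are biased toward the structure shared by the task family.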
Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement Learning
A key challenge in lifelong reinforcement learning (RL) is the loss of plasticity, where previous learning progress hinders an agent's adaptation to new tasks. While regularization and resetting can help, they require precise hyperparameter selection at the outset and environment-dependent adjustments. Building on the principled theory of online convex optimization, we present a parameter-free optimizer for lifelong RL, called TRAC, which requires no tuning or prior knowledge about the distribution shifts. Extensive experiments on Procgen, Atari, and Gym Control environments show that TRAC works surprisingly well, mitigating loss of plasticity and rapidly adapting to challenging distribution shifts, despite the underlying optimization problem being nonconvex and nonstationary.
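The abstract does not spell out TRAC's update rule, but the general shape of a parameter-free anchor-to-initialization scheme can be sketched. The code below is my own simplified stand-in, not the published algorithm: a base optimizer accumulates an offset from the initial parameters, and a single scalar scale on that offset is tuned online with an AdaGrad-style rule, so no step size for the scale is hand-picked.

```python
import numpy as np

class TinyTRAC:
    """Hypothetical sketch of a parameter-free anchor scheme (not TRAC itself)."""

    def __init__(self, theta0, base_lr=0.01):
        self.ref = np.array(theta0, dtype=float)   # anchor: initial parameters
        self.delta = np.zeros_like(self.ref)       # base optimizer's offset
        self.s = 0.0                               # learned scale on the offset
        self.base_lr = base_lr
        self.h_sq = 1e-12                          # running sum for AdaGrad

    def step(self, grad):
        grad = np.asarray(grad, dtype=float)
        self.delta -= self.base_lr * grad          # plain SGD on the offset
        h = float(grad @ self.delta)               # surrogate d(loss)/d(scale)
        self.h_sq += h * h
        self.s -= h / np.sqrt(self.h_sq)           # no hand-tuned scale step size
        return self.ref + self.s * self.delta      # effective parameters

opt = TinyTRAC([0.0])
params = np.array([0.0])
for _ in range(300):
    grad = 2.0 * (params - 3.0)    # gradient of the toy loss (x - 3)^2
    params = opt.step(grad)
```

When the scale stays near zero, the effective parameters stay near the initialization (countering plasticity loss); when progress is consistently useful, the scale grows and the base optimizer's updates take full effect.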
Reviews: A Meta-MDP Approach to Exploration for Lifelong Reinforcement Learning
Even after the discussion and the author response there was still some disagreement among the reviewers. The paper proposes a simple yet novel and very interesting idea. There are still a few concerns about clarity, but those can be fixed in the final version (see updated reviews). Overall this is a solid paper that (as always) would benefit from a more thorough empirical evaluation. One reviewer proposed adding a further baseline: a domain-randomized robust policy trained on a variety of tasks.
Statistical Guarantees for Lifelong Reinforcement Learning using PAC-Bayesian Theory
Zhi Zhang, Chris Chow, Yasi Zhang, Yanchao Sun, Haochen Zhang, Eric Hanchen Jiang, Han Liu, Furong Huang, Yuchen Cui, Oscar Hernan Madrid Padilla
Lifelong reinforcement learning (RL) has been developed as a paradigm for extending single-task RL to more realistic, dynamic settings. In lifelong RL, the "life" of an RL agent is modeled as a stream of tasks drawn from a task distribution. We propose EPIC (Empirical PAC-Bayes that Improves Continuously), a novel algorithm designed for lifelong RL using PAC-Bayes theory. EPIC learns a shared policy distribution, referred to as the world policy, which enables rapid adaptation to new tasks while retaining valuable knowledge from previous experiences. Our theoretical analysis establishes a relationship between the algorithm's generalization performance and the number of prior tasks preserved in memory. We also derive the sample complexity of EPIC in terms of RL regret. Extensive experiments on a variety of environments demonstrate that EPIC significantly outperforms existing methods in lifelong RL, offering both theoretical guarantees and practical efficacy through the use of the world policy.
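The world-policy loop in the EPIC abstract can be sketched in miniature. The construction below is mine, not EPIC's actual update: a shared Gaussian over policy parameters is carried across a task stream; each task adapts a sample of it, and the shared mean then moves toward the adapted solution with a damped update standing in for the PAC-Bayes KL penalty that keeps the posterior close to the prior.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 2
mu, sigma = np.zeros(dim), 1.0    # world policy: N(mu, sigma^2 I) over parameters
trust = 0.3                       # damping weight (stand-in for the KL penalty)

def adapt(theta, target, steps=50, lr=0.2):
    """Per-task adaptation on a quadratic surrogate of the task objective."""
    for _ in range(steps):
        theta = theta - lr * 2.0 * (theta - target)
    return theta

# a stream of related tasks whose optima cluster around [1, 1]
task_optima = [np.array([1.0, 1.0]) + 0.1 * rng.normal(size=dim) for _ in range(30)]

dist_before = np.linalg.norm(mu - np.array([1.0, 1.0]))
for target in task_optima:
    theta = adapt(mu + sigma * rng.normal(size=dim), target)  # sample, then adapt
    mu = (1.0 - trust) * mu + trust * theta                   # damped mean update
dist_after = np.linalg.norm(mu - np.array([1.0, 1.0]))
```

After the stream, the shared mean sits near the task family's common structure, so a sample from the world policy starts each new task close to its optimum; EPIC's theory relates this kind of retained memory to a generalization bound.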
CHIRPs: Change-Induced Regret Proxy metrics for Lifelong Reinforcement Learning
John Birkbeck, Adam Sobey, Federico Cerutti, Katherine Heseltine Hurley Flynn, Timothy J. Norman
Reinforcement learning agents can achieve superhuman performance in static tasks but are costly to train and fragile to task changes. This limits their deployment in real-world scenarios where training experience is expensive or the context changes through factors like sensor degradation, environmental processes, or changing mission priorities. Lifelong reinforcement learning aims to improve sample efficiency and adaptability by studying how agents perform in evolving problems. The difficulty that these changes pose to an agent, however, is rarely measured directly. Agent performances can be compared across a change, but this is often prohibitively expensive. We propose Change-Induced Regret Proxy (CHIRP) metrics, a class of metrics for approximating a change's difficulty while avoiding the high costs of using trained agents. A relationship between a CHIRP metric and agent performance is identified in two environments: a simple grid world and MetaWorld's suite of robotic arm tasks. We demonstrate two uses for these metrics: for learning, an agent that clusters MDPs based on a CHIRP metric achieves 17% higher average returns than three existing agents in a sequence of MetaWorld tasks. We also show how a CHIRP can be calibrated to compare the difficulty of changes across distinctly different environments.
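The core CHIRP idea, scoring a change's difficulty from the MDPs themselves rather than from trained agents, can be sketched as follows. The concrete distance below is my own illustrative stand-in, not the paper's metric: matched (state, action) samples are pushed through the dynamics before and after a change, and the mean displacement between the outcomes serves as the proxy score.

```python
import numpy as np

rng = np.random.default_rng(1)

def dynamics(states, actions, wind):
    """Toy stochastic transition; `wind` is the parameter the change alters."""
    return states + actions + wind + 0.1 * rng.normal(size=states.shape)

def chirp(wind_before, wind_after, n=500):
    """Proxy difficulty of the change wind_before -> wind_after (no agent needed)."""
    states = rng.uniform(-1.0, 1.0, size=(n, 2))
    actions = rng.uniform(-1.0, 1.0, size=(n, 2))
    gap = (dynamics(states, actions, wind_before)
           - dynamics(states, actions, wind_after))
    return float(np.mean(np.linalg.norm(gap, axis=1)))

mild = chirp(0.0, 0.1)      # small shift in the dynamics scores low
drastic = chirp(0.0, 1.0)   # large shift in the dynamics scores high
```

Scores of this kind can then drive downstream decisions, for instance clustering MDPs so a policy is reused within a cluster (as in the paper's MetaWorld experiment), or thresholding the score to decide whether to keep adapting a policy or start fresh.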
Interview with Safa Alver: Scalable and robust planning in lifelong reinforcement learning
In their paper Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning, Safa Alver and Doina Precup introduced special kinds of models that allow for performing scalable and robust planning in lifelong reinforcement learning scenarios. In this interview, Safa Alver tells us more about this work. It has long been argued that in order for reinforcement learning (RL) agents to perform well in lifelong RL (LRL) scenarios (which are scenarios like the ones we, biological agents, encounter in real life), they should be able to learn a model of their environment, which allows for advanced computational abilities such as counterfactual reasoning and fast re-planning. Even though this is a widely accepted view in the community, the question of what kinds of models would be better suited for performing LRL still remains unanswered. As LRL scenarios involve large environments with lots of irrelevant aspects and environments with unexpected distribution shifts, directly applying the ideas developed in the classical model-based RL literature to these scenarios is likely to lead to catastrophic results in building scalable and robust lifelong learning agents.
Towards Continual Reinforcement Learning: A Review and Perspectives
Khetarpal, Khimya | Riemer, Matthew (IBM Research, Mila, University of Montreal) | Rish, Irina | Precup, Doina
In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We begin by discussing our perspective on why RL is a natural fit for studying continual learning. We then provide a taxonomy of different continual RL formulations by mathematically characterizing two key properties of non-stationarity, namely its scope and its driver. This offers a unified view of various formulations. Next, we review and present a taxonomy of continual RL approaches. We go on to discuss evaluation of continual RL agents, providing an overview of benchmarks used in the literature and important metrics for understanding agent performance. Finally, we highlight open problems and challenges in bridging the gap between the current state of continual RL and findings in neuroscience. While still in its early days, the study of continual RL has the promise to develop better incremental reinforcement learners that can function in increasingly realistic applications where non-stationarity plays a vital role. These include applications such as those in the fields of healthcare, education, logistics, and robotics.
Lifelong Machine Learning of Functionally Compositional Structures
A hallmark of human intelligence is the ability to construct self-contained chunks of knowledge and reuse them in novel combinations for solving different problems. Learning such compositional structures has been a challenge for artificial systems, due to the underlying combinatorial search. To date, research into compositional learning has largely proceeded separately from work on lifelong or continual learning. This dissertation integrated these two lines of work to present a general-purpose framework for lifelong learning of functionally compositional structures. The framework separates the learning into two stages: learning how to combine existing components to assimilate a novel problem, and learning how to adapt the existing components to accommodate the new problem. This separation explicitly handles the trade-off between stability and flexibility. This dissertation instantiated the framework into various supervised and reinforcement learning (RL) algorithms. Supervised learning evaluations found that 1) compositional models improve lifelong learning of diverse tasks, 2) the multi-stage process permits lifelong learning of compositional knowledge, and 3) the components learned by the framework represent self-contained and reusable functions. Similar RL evaluations demonstrated that 1) algorithms under the framework accelerate the discovery of high-performing policies, and 2) these algorithms retain or improve performance on previously learned tasks. The dissertation extended one lifelong compositional RL algorithm to the nonstationary setting, where the task distribution varies over time, and found that modularity permits individually tracking changes to different elements in the environment. The final contribution of this dissertation was a new benchmark for compositional RL, which exposed that existing methods struggle to discover the compositional properties of the environment.
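The dissertation's two-stage recipe, assimilation then accommodation, can be made concrete with a small linear sketch. The setup and names below are mine, not the dissertation's: "components" are shared linear maps; stage 1 fits only a new task's combination weights with the components frozen, and stage 2 then lightly updates the shared components themselves.

```python
import numpy as np

rng = np.random.default_rng(2)
D, K = 4, 3
modules = rng.normal(size=(K, D))            # shared, reusable components

def fit_task(x, y, modules, w_steps=300, m_steps=20, lr_w=0.05, lr_m=0.01):
    n = len(x)
    w = np.zeros(K)                          # this task's combination weights
    F = x @ modules.T                        # per-sample component outputs
    for _ in range(w_steps):                 # stage 1: assimilation (frozen modules)
        w -= lr_w * F.T @ (F @ w - y) / n
    for _ in range(m_steps):                 # stage 2: accommodation (small update)
        err = (x @ modules.T) @ w - y
        modules = modules - lr_m * np.outer(w, x.T @ err / n)
    return w, modules

true_parts = rng.normal(size=(2, D))         # latent structure shared by tasks
for _ in range(5):                           # a short stream of related tasks
    beta = rng.normal(size=2) @ true_parts   # each task mixes the same parts
    x = rng.normal(size=(100, D))
    y = x @ beta
    w, modules = fit_task(x, y, modules)

err = np.mean(((x @ modules.T) @ w - y) ** 2)   # residual on the last task
```

Separating the two stages is what handles the stability/flexibility trade-off the abstract describes: stage 1 cannot corrupt knowledge stored in the components, while stage 2 lets them drift only slowly toward the new task.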